Chapter 16
Getting Straight Talk on Straight-Line Regression
IN THIS CHAPTER
Determining when to use straight-line regression
Running a straight-line regression and making sense of the output
Examining results for issues and problems
Estimating needed sample size for straight-line regression
Chapter 15 refers to regression analyses in a general way. This chapter focuses on the simplest type of
regression analysis: straight-line regression. You can visualize it as fitting a straight line to the points
in a scatter plot from a set of data involving just two variables. Those two variables are generally
referred to as X and Y. The X variable is formally called the independent variable (or the predictor or
cause). The Y variable is called the dependent variable (or the outcome or effect).
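To make "fitting a straight line to the points" concrete, here is a minimal sketch in Python. It assumes the line is fit by ordinary least squares, which is the usual method for this kind of regression. The function name fit_line and the five data points are made up for illustration; they aren't taken from any data set in this book.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of X, and sum of cross-products of X and Y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: X is the independent variable, Y the dependent variable
x_values = [1, 2, 3, 4, 5]
y_values = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(x_values, y_values)
print(f"Y = {intercept:.2f} + {slope:.2f} * X")
```

The slope tells you how much Y changes, on average, for each one-unit increase in X, and the intercept is the value the line predicts for Y when X is zero.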
Knowing When to Use Straight-Line Regression
You may see straight-line regression referred to in books and articles by several different
names, including linear regression, simple linear regression, linear univariate regression, and
linear bivariate regression. This abundance of names can be confusing, so we always use the
term straight-line regression.
Straight-line regression is appropriate when all of these things are true:
You’re interested in the relationship between two — and only two — numerical variables. At least
one of them must be a continuous variable that serves as the dependent variable (Y).
You’ve made a scatter plot of the two variables and the data points seem to lie, more or less, along
a straight line (as shown in Figures 16-1a and 16-1b). You shouldn’t try to fit a straight line to data
that appears to lie along a curved line (as shown in Figures 16-1c and 16-1d).
The data points appear to scatter randomly around the straight line over the entire range of the
chart, with no extreme outliers (as shown in Figures 16-1a and 16-1b).
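If you'd like a rough numeric companion to eyeballing the scatter plot, the sketch below fits a least-squares line and then looks at the residuals (each point's vertical distance from the line). It's an informal illustration, not a formal test: the function names and both data sets are hypothetical. The idea is that points scattering randomly around a line tend to give residuals whose signs are mixed, whereas points lying along a curve tend to give long runs of the same sign.

```python
def residuals_from_line(xs, ys):
    """Fit a least-squares line and return each point's residual (observed - predicted)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def sign_pattern(residuals):
    """Show residual signs in X order: '+' above the line, '-' on or below it."""
    return "".join("+" if r > 0 else "-" for r in residuals)


# Hypothetical data that lies roughly along a straight line
straight_x = [1, 2, 3, 4, 5]
straight_y = [2.1, 3.9, 6.2, 7.8, 10.1]

# Hypothetical data that lies along a curve (Y = X squared)
curved_x = [1, 2, 3, 4, 5, 6]
curved_y = [1, 4, 9, 16, 25, 36]

straight_signs = sign_pattern(residuals_from_line(straight_x, straight_y))
curved_signs = sign_pattern(residuals_from_line(curved_x, curved_y))

print("Straight-line data:", straight_signs)
print("Curved data:       ", curved_signs)
```

In the curved data set, the points sit above the line at both ends and below it in the middle, which is the pattern you'd expect when a straight line is forced through curved data. A single residual that dwarfs all the others would similarly hint at an extreme outlier. Treat this only as a supplement to the scatter plot, not a replacement for it.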